The highway of the development of entropy is marked by many great names, for example, Clausius, Gibbs, Boltzmann, Szilard, von Neumann, Shannon, Jaynes, and several others. In this article the emphasis is put on von Neumann and on quantum mechanics. The selection of the subjects reflects the taste (and the knowledge) of the author and it must be rather restrictive. In the past 50 years entropy has broken out of thermodynamics and statistical mechanics, invaded communication theory and ergodic theory, and shown up in mathematical statistics and in the social and life sciences. It is practically impossible to present all of its features. The favourite subjects of entropy are macroscopic phenomena, irreversibility and incomplete knowledge. In the strictly mathematical sense entropy is related to the asymptotics of probabilities, or it is a kind of asymptotic behaviour of probabilities.

This paper is organized as follows. After a short introduction to entropy, von Neumann's gedanken experiment is repeated, which led him to the formula of thermodynamic entropy of a statistical operator. In the analysis of his ideas we stress the role of (the lack of) superselection sectors and summarize von Neumann's knowledge about quantum mechanical entropy. The final part of this article is devoted to some important developments of the von Neumann entropy which were discovered long after von Neumann's work. Subadditivity and the interpretation of the von Neumann entropy as the capacity of a communication channel are among those.
1 General introduction to entropy



The word "entropy" was created by Rudolf Clausius and it appeared in his work "Abhandlungen über die mechanische Wärmetheorie" published in 1864. The word has a Greek origin: its first part reminds us of "energy" and the second part is from "tropos", which means turning point. Clausius' work is the foundation stone of classical thermodynamics. According to Clausius, the change of entropy of a system is obtained by adding the small portions of heat quantity received by the system divided by the absolute temperature during the heat absorption. This definition is satisfactory from a mathematical point of view and gives nothing other than an integral in precise mathematical terms. Clausius postulated that the entropy of a closed system cannot decrease, which is generally referred to as the second law of thermodynamics. On the other hand, he did not provide any heuristic argument to support the law. This fact might partly be responsible for the mystery surrounding entropy for a long time. As an extreme, we can cite Alfred Wehrl, who had the opinion in 1978 that "the second law of thermodynamics does not appear to be fully understood yet" [?].

The concept of entropy was really clarified by Ludwig Boltzmann. His scientific program was to deal with the mechanical theory of heat in connection with probabilities. Assume that a macroscopic system consists of a large number of microscopic ones; we simply call them particles. Since we have ideas of quantum mechanics in mind, we assume that each of the particles is in one of the energy levels $E_1 < E_2 < \dots < E_m$. The number of particles in the level $E_i$ is $N_i$, so $\sum_i N_i = N$ is the total number of particles. A macrostate of our system is given by the occupation numbers $N_1, N_2, \dots, N_m$. The energy of a macrostate is $E = \sum_i N_i E_i$. A given macrostate can be realized by many configurations of the $N$ particles, each of them at a certain energy level $E_i$. Those configurations are called microstates. Many microstates realize the same macrostate. We count the number of ways of arranging $N$ particles in $m$ boxes (i.e., energy levels) such that the boxes contain $N_1, N_2, \dots, N_m$ particles, respectively. There are

$$\binom{N}{N_1, N_2, \dots, N_m} := \frac{N!}{N_1!\, N_2! \cdots N_m!} \tag{1}$$

such ways. This multinomial coefficient is the number of microstates realizing the macrostate $(N_1, N_2, \dots, N_m)$ and it is proportional to the probability of the macrostate if all configurations are assumed to be equally likely. Boltzmann called (1) the thermodynamical probability of the macrostate, in German "thermodynamische Wahrscheinlichkeit", hence the letter $W$ was used. Of course, Boltzmann argued in the framework of classical mechanics and the discrete values of energy came from an approximation procedure with "energy cells".

If we are interested in the thermodynamic limit, $N$ increasing to infinity, we use the relative numbers $p_i := N_i/N$ to label a macrostate and, instead of the total energy $E = \sum_i N_i E_i$, we consider the average energy per particle $E/N = \sum_i p_i E_i$. To find the most probable macrostate, we wish to maximize (1) under a certain constraint. The Stirling approximation of the factorials gives

$$\frac{1}{N}\log\binom{N}{N_1, N_2, \dots, N_m} = H(p_1, p_2, \dots, p_m) + O(N^{-1}\log N), \tag{2}$$

where

$$H(p_1, p_2, \dots, p_m) := \sum_i -p_i\log p_i. \tag{3}$$
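The approximation (2) can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names and the sample distribution are illustrative assumptions, and `math.lgamma` is used to evaluate the logarithm of the factorials without overflow.

```python
import math

def log_multinomial(counts):
    """Natural log of N! / (N_1! ... N_m!), computed via lgamma to avoid overflow."""
    n = sum(counts)
    return math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)

def shannon_entropy(probs):
    """H(p) = -sum_i p_i log p_i (natural log), with 0 log 0 taken as 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def stirling_gap(probs, n):
    """|(1/N) log W - H(p)| for occupation numbers N_i = p_i * N (assumed integral)."""
    counts = [round(p * n) for p in probs]
    return abs(log_multinomial(counts) / n - shannon_entropy(probs))

probs = [0.5, 0.3, 0.2]  # illustrative relative occupation numbers
for n in (10, 100, 1000, 10000):
    print(n, stirling_gap(probs, n))
```

The printed gap shrinks as $N$ grows, in line with the $O(N^{-1}\log N)$ error term.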

If $N$ is large then the approximation (2) yields that instead of maximizing the quantity (1) we can maximize (3). For example, maximizing (3) under the constraint $\sum_i p_i E_i = e$, we get

$$p_i = \frac{e^{-\lambda E_i}}{\sum_j e^{-\lambda E_j}}, \tag{4}$$

where the constant $\lambda$ is the solution of the equation

$$\sum_i E_i\, \frac{e^{-\lambda E_i}}{\sum_j e^{-\lambda E_j}} = e.$$

Note that the last equation has a unique solution if $E_1 < e < E_m$, and the distribution (4) is known as the discrete Maxwell-Boltzmann law today.
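The determination of $\lambda$ can be illustrated with a short numerical sketch. It assumes a hypothetical three-level system and a search interval $[-50, 50]$ for $\lambda$; bisection is our choice here and works because the mean energy is strictly decreasing in $\lambda$.

```python
import math

def boltzmann(levels, lam):
    """Distribution p_i = exp(-lam*E_i) / sum_j exp(-lam*E_j), as in (4)."""
    shift = min(lam * e for e in levels)  # subtracted for numerical stability
    weights = [math.exp(-(lam * e - shift)) for e in levels]
    z = sum(weights)
    return [w / z for w in weights]

def mean_energy(levels, lam):
    """Average energy per particle, sum_i p_i E_i."""
    return sum(p * e for p, e in zip(boltzmann(levels, lam), levels))

def solve_lambda(levels, e_target, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection for lam with mean_energy(lam) = e_target (bracket is an assumption)."""
    if not min(levels) < e_target < max(levels):
        raise ValueError("need E_1 < e < E_m")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_energy(levels, mid) > e_target:
            lo = mid  # energy still too high: a larger lam is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)

levels = [0.0, 1.0, 2.0]  # hypothetical energy levels
lam = solve_lambda(levels, 0.6)
print(lam, boltzmann(levels, lam))
```

Since the target energy 0.6 lies below the uniform average 1.0, the resulting $\lambda$ is positive and the lower levels are more populated.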

Let $p_1, p_2, \dots, p_n$ be the probabilities of different outcomes of a random experiment. According to Shannon, the expression (3) is a measure of our ignorance prior to the experiment. Hence it is also the amount of information gained by performing the experiment. (3) is maximal when all the $p_i$'s are equal. In information theory logarithms with base 2 are used and the unit of information is called bit (from binary digit). As will be seen below, an extra factor equal to Boltzmann's constant is included in the physical definition of entropy.



2 Von Neumann's contribution to entropy 

The comprehensive mathematical formalism of quantum mechanics was first presented in the famous book "Mathematische Grundlagen der Quantenmechanik" published in 1932 by Johann von Neumann. In the traditional approach to quantum mechanics, a physical system is described in a Hilbert space: observables correspond to selfadjoint operators and statistical operators are associated with the states. In fact, a statistical operator describes a mixture of pure states. Pure states are the really physical states and they are given by rank one statistical operators, or equivalently by rays of the Hilbert space.
Von Neumann associated an entropy quantity to a statistical operator in 1927 [?] and the discussion was extended in his book [?]. His argument was a gedanken experiment on the ground of phenomenological thermodynamics. Let us consider a gas of $N$ ($\gg 1$) molecules in a rectangular box $K$. Suppose that the gas behaves like a quantum system and is described by a statistical operator $D$ which is a mixture $\lambda|\varphi_1\rangle\langle\varphi_1| + (1-\lambda)|\varphi_2\rangle\langle\varphi_2|$, where $|\varphi_i\rangle \equiv \varphi_i$ is a state vector ($i = 1, 2$). We may take $\lambda N$ molecules in the pure state $\varphi_1$ and $(1-\lambda)N$ molecules in the pure state $\varphi_2$. On the basis of phenomenological thermodynamics we assume that if $\varphi_1$ and $\varphi_2$ are orthogonal, then there is a wall which is completely permeable for the $\varphi_1$-molecules and isolating for the $\varphi_2$-molecules. (In fact, von Neumann supplied an argument that such a wall exists if and only if the state vectors are orthogonal.) We add an equally large empty rectangular box $K'$ to the left of the box $K$ and we replace the common wall with two new walls. Wall (a), the one to the left, is impenetrable, whereas the one to the right, wall (b), lets through the $\varphi_1$-molecules but keeps back the $\varphi_2$-molecules. We add a third wall (c) opposite to (b) which is semipermeable, transparent for the $\varphi_2$-molecules and impenetrable for the $\varphi_1$-ones. Then we push slowly (a) and (c) to the left, maintaining their distance. During this process the $\varphi_1$-molecules are pressed through (b) into $K'$ and the $\varphi_2$-molecules diffuse through wall (c) and remain in $K$. No work is done against the gas pressure, no heat is developed. Replacing the walls (b) and (c) with a rigid absolutely impenetrable wall and removing (a) we restore the boxes $K$ and $K'$ and succeed in the separation of the $\varphi_1$-molecules from the $\varphi_2$-ones without any work being done, without any temperature change and without evolution of heat. The entropy of the original $D$-gas (with density $N/V$) must be the sum of the entropies of the $\varphi_1$- and $\varphi_2$-gases (with densities $\lambda N/V$ and $(1-\lambda)N/V$, respectively). If we compress the gases in $K$ and $K'$ to the volumes $\lambda V$ and $(1-\lambda)V$, respectively, keeping the temperature $T$ constant by means of a heat reservoir, the entropy change amounts to $\kappa\lambda N\log\lambda$ and $\kappa(1-\lambda)N\log(1-\lambda)$, respectively. Indeed, we have to add heat in the amount of $\lambda_i N\kappa T\log\lambda_i$ ($< 0$) when the $\varphi_i$-gas is compressed (with $\lambda_1 = \lambda$ and $\lambda_2 = 1-\lambda$), and dividing by the temperature $T$ we get the change of entropy. Finally, mixing the $\varphi_1$- and $\varphi_2$-gases of identical density we obtain a $D$-gas of $N$ molecules in a volume $V$ at the original temperature. If $S_0(\psi, N)$ denotes the entropy of a $\psi$-gas of $N$ molecules (in a volume $V$ and at the given temperature), we conclude that

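$$S_0(\varphi_1, \lambda N) + S_0(\varphi_2, (1-\lambda)N) = S_0(D, N) + \kappa\lambda N\log\lambda + \kappa(1-\lambda)N\log(1-\lambda)$$

must hold, where $\kappa$ is Boltzmann's constant. Assuming that $S_0(\psi, N)$ is proportional to $N$ and dividing by $N$ we have

$$\lambda S(\varphi_1) + (1-\lambda)S(\varphi_2) = S(D) + \kappa\lambda\log\lambda + \kappa(1-\lambda)\log(1-\lambda), \tag{5}$$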

where $S$ is a certain thermodynamical entropy quantity (relative to the fixed temperature and molecule density). We arrived at the mixing property of entropy, but we should not forget about the initial assumption: $\varphi_1$ and $\varphi_2$ are supposed to be orthogonal. Instead of a two-component mixture, von Neumann operated with an infinite mixture, which does not make a big difference, and he concluded that

$$S\Big(\sum_i \lambda_i|\varphi_i\rangle\langle\varphi_i|\Big) = \sum_i \lambda_i S(|\varphi_i\rangle\langle\varphi_i|) - \kappa\sum_i \lambda_i\log\lambda_i. \tag{6}$$
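As an illustration of the mixing term in (6), the following sketch builds a two-component mixture of orthogonal pure states and compares its entropy with $-\sum_i \lambda_i\log\lambda_i$. It assumes $\kappa = 1$, zero entropy for pure states, and real $2\times 2$ density matrices; the helper names are hypothetical.

```python
import math

def projector(v):
    """Rank-one operator |v><v| for a real unit vector v in R^2."""
    return [[v[0] * v[0], v[0] * v[1]], [v[1] * v[0], v[1] * v[1]]]

def mix(weights, states):
    """Convex combination sum_i w_i |v_i><v_i| as a 2x2 matrix."""
    d = [[0.0, 0.0], [0.0, 0.0]]
    for w, v in zip(weights, states):
        p = projector(v)
        for r in range(2):
            for c in range(2):
                d[r][c] += w * p[r][c]
    return d

def eta(t):
    """eta(t) = -t log t, with eta(0) = 0 (tiny rounding errors are clipped)."""
    return -t * math.log(t) if t > 1e-15 else 0.0

def entropy_2x2(d):
    """Entropy (kappa = 1) of a real symmetric 2x2 density matrix via its eigenvalues."""
    tr = d[0][0] + d[1][1]
    det = d[0][0] * d[1][1] - d[0][1] * d[1][0]
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return eta((tr + disc) / 2.0) + eta((tr - disc) / 2.0)

theta, lam = 0.7, 0.3  # illustrative rotation angle and mixing weight
phi1 = (math.cos(theta), math.sin(theta))
phi2 = (-math.sin(theta), math.cos(theta))  # orthogonal to phi1
D = mix([lam, 1.0 - lam], [phi1, phi2])
print(entropy_2x2(D), eta(lam) + eta(1.0 - lam))
```

The two printed numbers agree up to rounding, independently of the angle `theta`, as long as the two state vectors are orthogonal.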

Before we continue to follow his considerations, let us note that von Neumann's argument does not require that the statistical operator $D$ is a mixture of pure states. What we really needed is the property $D = \lambda D_1 + (1-\lambda)D_2$ in such a way that the possibly mixed states $D_1$ and $D_2$ are disjoint. $D_1$ and $D_2$ are disjoint in the thermodynamical sense when there is a wall which is completely permeable for the molecules of a $D_1$-gas and isolating for the molecules of a $D_2$-gas. In other words, if the mixed states $D_1$ and $D_2$ are disjoint, then this should be demonstrated by a certain filter. Mathematically, the disjointness of $D_1$ and $D_2$ is expressed in the orthogonality of the eigenvectors corresponding to nonzero eigenvalues of the two density matrices. The essential point is in the remark that equation (5) must hold also in a more general situation when possibly the states do not correspond to density matrices but orthogonality of the states makes sense:

$$\lambda S(D_1) + (1-\lambda)S(D_2) = S(D) + \kappa\lambda\log\lambda + \kappa(1-\lambda)\log(1-\lambda). \tag{7}$$

Equation (6) reduces the determination of the (thermodynamical) entropy of a mixed state to that of pure states. The so-called Schatten decomposition $\sum_i \lambda_i|\varphi_i\rangle\langle\varphi_i|$ of a statistical operator is not unique even if $\langle\varphi_i, \varphi_j\rangle = 0$ is assumed for $i \neq j$. When $\lambda_i$ is an eigenvalue with multiplicity, then the corresponding eigenvectors can be chosen in many ways. If we expect the entropy $S(D)$ to be independent of the Schatten decomposition, then we are led to the conclusion that $S(|\varphi\rangle\langle\varphi|)$ must be independent of the state vector $|\varphi\rangle$. This argument assumes that there are no superselection sectors, that is, any vector of the Hilbert space can be a state vector. On the other hand, von Neumann wanted to avoid degeneracy of the spectrum of a statistical operator (as well as the possible degeneracy of the spectrum of observables, as we shall see below).

Von Neumann's proof of the property that $S(|\varphi\rangle\langle\varphi|)$ is independent of the state vector $|\varphi\rangle$ was different. He did not want to refer to a unitary time development sending one state vector to another, because that argument requires great freedom in choosing the energy operator $H$. Namely, for any $|\varphi_1\rangle$ and $|\varphi_2\rangle$ we would need an energy operator $H$ such that

$$e^{\mathrm{i}tH}|\varphi_1\rangle = |\varphi_2\rangle.$$

This process would be reversible. (It is worthwhile to note that the problem of superselection sectors appears also here.)

Von Neumann proved that $S(|\varphi_1\rangle\langle\varphi_1|) \le S(|\varphi_2\rangle\langle\varphi_2|)$ by constructing a great number of measurement processes sending the state $|\varphi_1\rangle$ into an ensemble which differs from $|\varphi_2\rangle\langle\varphi_2|$ by an arbitrarily small amount. The measurement of an observable $A = \sum_i \lambda_i|\psi_i\rangle\langle\psi_i|$ in a state $|\varphi\rangle$ yields an ensemble of the pure states $|\psi_i\rangle\langle\psi_i|$ with weights $|\langle\varphi|\psi_i\rangle|^2$. This was a basic postulate in von Neumann's measurement theory when the eigenvalues of $A$ are non-degenerate, that is, the $\lambda_i$'s are all different. In a modern language, von Neumann's measurement is a conditional expectation onto a maximal Abelian subalgebra of the algebra of all bounded operators acting on the given Hilbert space. Let $(|\psi_i\rangle)_i$ be an orthonormal basis consisting of eigenvectors of the observable under measurement. For any bounded operator $T$ we set

$$E(T) = \sum_i \langle\psi_i|T|\psi_i\rangle\, |\psi_i\rangle\langle\psi_i|. \tag{8}$$

The linear transformation $E$ possesses the following properties:

(i) $E = E^2$.

(ii) If $T \ge 0$ then $E(T) \ge 0$.

(iii) $E(I) = I$.

(iv) $\mathrm{Tr}\,(E(T)) = \mathrm{Tr}\,T$.
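Properties (i)-(iv) can be verified on a small example. The sketch below is only an illustration under simplifying assumptions: real symmetric $2\times 2$ matrices, a rotated orthonormal basis, and hypothetical helper names (`pinch`, `close`).

```python
import math

def pinch(t, basis):
    """E(T) = sum_i <psi_i|T|psi_i> |psi_i><psi_i| for a real orthonormal basis of R^2."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for v in basis:
        # expectation value <v|T|v>
        ev = sum(v[r] * t[r][c] * v[c] for r in range(2) for c in range(2))
        for r in range(2):
            for c in range(2):
                out[r][c] += ev * v[r] * v[c]
    return out

def trace(t):
    return t[0][0] + t[1][1]

def det(t):
    return t[0][0] * t[1][1] - t[0][1] * t[1][0]

def close(a, b, tol=1e-12):
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(2) for c in range(2))

theta = 0.4  # illustrative rotation angle defining the measurement basis
basis = [(math.cos(theta), math.sin(theta)), (-math.sin(theta), math.cos(theta))]
T = [[2.0, 0.5], [0.5, 1.0]]  # a positive operator chosen for illustration
I2 = [[1.0, 0.0], [0.0, 1.0]]

ET = pinch(T, basis)
print("(i)   idempotent:", close(pinch(ET, basis), ET))
print("(ii)  positive:  ", trace(ET) >= 0 and det(ET) >= 0)
print("(iii) unital:    ", close(pinch(I2, basis), I2))
print("(iv)  trace:     ", abs(trace(ET) - trace(T)) < 1e-12)
```

For a $2\times 2$ symmetric matrix, nonnegative trace and determinant are equivalent to positive semidefiniteness, which is how (ii) is checked here.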

In particular, for a statistical operator $D$ its transform $E(D)$ is a statistical operator as well. It follows immediately from definition (8) that

$$E(|\varphi\rangle\langle\varphi|) = \sum_i |\langle\varphi|\psi_i\rangle|^2\, |\psi_i\rangle\langle\psi_i|,$$

and the conditional expectation $E$ acts on the pure states exactly in the same way as it is described in the measurement procedure. It was natural for von Neumann to assume that

$$S(D) \le S(E(D)), \tag{9}$$

at least if the statistical operator $D$ corresponds to a pure state. Inequality (9) is nothing other than the manifestation of the second law for the measurement process.

In the proof of the inequality $S(|\varphi_1\rangle\langle\varphi_1|) \le S(|\varphi_2\rangle\langle\varphi_2|)$ one can assume that the vectors $|\varphi_1\rangle$ and $|\varphi_2\rangle$ are orthogonal. The idea is to construct measurements $E_1, E_2, \dots, E_k$ such that

$$E_k(\dots(E_1(|\varphi_1\rangle\langle\varphi_1|))\dots) \tag{10}$$

is in a given small neighbourhood of $|\varphi_2\rangle\langle\varphi_2|$. The details are well-presented in von Neumann's original work, but we confine ourselves here to his definition for $E_n$. He set a unit vector

$$\psi^{(n)} = \cos\frac{\pi n}{2k}\,\varphi_1 + \sin\frac{\pi n}{2k}\,\varphi_2$$

and extended it to a complete orthonormal system. The measurement conditional expectation $E_n$ corresponds to this basis ($1 \le n \le k$). It is elementary to show that (10) tends to $|\varphi_2\rangle\langle\varphi_2|$ as $k \to \infty$. We stress again that the argument needs that $|\varphi_1\rangle$ and $|\varphi_2\rangle$ are in the same superselection sector, so that their linear combinations may be state vectors.
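The convergence of the iterated measurements can be made concrete in the simplest setting. The sketch below assumes a two-dimensional Hilbert space spanned by $\varphi_1$ and $\varphi_2$, so that the orthonormal completion of $\psi^{(n)}$ is forced; the function names are hypothetical. Under this assumption a short computation gives the overlap with $\varphi_2$ after $k$ steps as $\tfrac12 + \tfrac12\cos^k(\pi/k)$, which the code reproduces.

```python
import math

def pinch(d, basis):
    """Measurement conditional expectation for a real orthonormal basis of R^2."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for v in basis:
        ev = sum(v[r] * d[r][c] * v[c] for r in range(2) for c in range(2))
        for r in range(2):
            for c in range(2):
                out[r][c] += ev * v[r] * v[c]
    return out

def final_overlap(k):
    """<phi_2| E_k(...E_1(|phi_1><phi_1|)...) |phi_2> in the basis (phi_1, phi_2)."""
    d = [[1.0, 0.0], [0.0, 0.0]]  # |phi_1><phi_1|
    for n in range(1, k + 1):
        a = math.pi * n / (2 * k)
        basis = [(math.cos(a), math.sin(a)), (-math.sin(a), math.cos(a))]
        d = pinch(d, basis)
    return d[1][1]

for k in (1, 10, 100, 1000):
    print(k, final_overlap(k), 0.5 + 0.5 * math.cos(math.pi / k) ** k)
```

The overlap approaches 1 as $k$ grows, so many gentle measurements carry $|\varphi_1\rangle\langle\varphi_1|$ arbitrarily close to $|\varphi_2\rangle\langle\varphi_2|$.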

Let us summarize von Neumann's discussion of the thermodynamical entropy of a statistical operator $D$. First of all, he assumed that $S(D)$ is a continuous function of $D$. He carried out a reversible process to obtain the mixing property (5) for orthogonal pure states, and he concluded (6). He referred to the second law again when assuming (9) for pure states. Then he showed that $S(|\varphi\rangle\langle\varphi|)$ is independent of the state vector $|\varphi\rangle$ so that

$$S\Big(\sum_i \lambda_i|\varphi_i\rangle\langle\varphi_i|\Big) = -\kappa\sum_i \lambda_i\log\lambda_i \tag{11}$$

up to an additive constant which could be chosen to be 0 as a matter of normalization. (11) is von Neumann's celebrated entropy formula; it has a more elegant form

$$S(D) = \kappa\,\mathrm{Tr}\,\eta(D), \tag{12}$$

where $\eta : \mathbb{R}^+ \to \mathbb{R}$ is the continuous function $\eta(t) = -t\log t$. (The modern notation for $-t\log t$ comes from information theory, which did not exist at that time.)

When von Neumann deduced (12), his natural intention was to make mild assumptions. For example, the monotonicity (9) was assumed only for pure states. If we already have (12) as a definition, then (9) can be proved for an arbitrary statistical operator $D$. The argument is based on the Jensen inequality, and von Neumann remarked that for

$$S_f(D) = \mathrm{Tr}\,f(D)$$

with a differentiable concave function $f : [0, 1] \to \mathbb{R}$,

$$S_f(D) \le S_f(E(D)) \tag{13}$$

holds for every statistical operator $D$. His analysis also indicated that the measurement process is typically irreversible: the finite entropy of a statistical operator definitely increases if a state change occurs.
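A small numerical illustration of the monotonicity under measurement: the sketch below takes a hypothetical $2\times 2$ density matrix with off-diagonal entries, applies the conditional expectation for the standard basis (which simply deletes the off-diagonal part), and compares entropies with $\kappa = 1$.

```python
import math

def eta(t):
    """eta(t) = -t log t, with eta(0) = 0."""
    return -t * math.log(t) if t > 1e-15 else 0.0

def entropy_2x2(d):
    """Entropy of a real symmetric 2x2 density matrix via its two eigenvalues."""
    tr = d[0][0] + d[1][1]
    det = d[0][0] * d[1][1] - d[0][1] * d[1][0]
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return eta((tr + disc) / 2.0) + eta((tr - disc) / 2.0)

def pinch_standard(d):
    """Conditional expectation for the standard basis: keep only the diagonal."""
    return [[d[0][0], 0.0], [0.0, d[1][1]]]

D = [[0.8, 0.3], [0.3, 0.2]]  # illustrative density matrix (trace 1, positive)
S_before = entropy_2x2(D)
S_after = entropy_2x2(pinch_standard(D))
print(S_before, S_after)
```

Here the state is changed by the measurement, and the entropy strictly increases.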

Von Neumann solved the maximization problem for $S(D)$ under the constraint $\mathrm{Tr}\,DH = e$. This means the determination of the ensemble of maximal entropy when the expectation of the energy operator $H$ is a prescribed value $e$. It is convenient to rephrase his argument in terms of conditional expectations. $H = H^*$ is assumed to have a discrete spectrum and we have a conditional expectation $E$ determined by the eigenbasis of $H$. If we pass from an arbitrary statistical operator $D$ with $\mathrm{Tr}\,DH = e$ to $E(D)$, then the entropy is increasing on the one hand and the expectation of the energy does not change on the other hand, so the maximizer should be searched among the operators commuting with $H$. In this way we are (and von Neumann was) back to the classical problem of statistical mechanics treated at the beginning of this article. In terms of operators the solution is in the form

$$\frac{\exp(-\beta H)}{\mathrm{Tr}\,\exp(-\beta H)},$$

which is called the Gibbs state today.
3 Some topics about entropy from von Neumann to the present



After Boltzmann and von Neumann, it was Shannon who initiated the interpretation of the quantity $-\sum_i p_i\log p_i$ as "uncertainty measure" or "information measure". The American electrical engineer/scientist Claude Shannon created communication theory in 1948. He posed a problem in the following way:

"Suppose we have a set of possible events whose probabilities of occurrence are $p_1, p_2, \dots, p_n$.
These probabilities are known but that is all we know concerning which event will occur.
Can we find a measure of how much "choice" is involved in the selection of the event or how
uncertain we are of the outcome?"

Denoting such a measure by $H(p_1, p_2, \dots, p_n)$ he listed three very reasonable requirements which should be satisfied. He concluded that the only $H$ satisfying the three assumptions is of the form

$$H = -K\sum_{i=1}^{n} p_i\log p_i,$$

where $K$ is a positive constant. For $H$ he used different names such as information, uncertainty and entropy. Many years later Shannon said [?]:

"My greatest concern was what to call it. I thought of calling it 'information', but the word 
was overly used, so I decided to call it 'uncertainty'. When I discussed it with John von 
Neumann, he had a better idea. Von Neumann told me, 'You should call it entropy, for two 
reasons. In the first place your uncertainty function has been used in statistical mechanics 
under that name, so it already has a name. In the second place, and more important, nobody 
knows what entropy really is, so in a debate you will always have the advantage.'"

Shannon's postulates were transformed later into the following axioms: 

(a) Continuity: $H(p, 1-p)$ is a continuous function of $p$.

(b) Symmetry: $H(p_1, p_2, \dots, p_n)$ is a symmetric function of its variables.

(c) Recursion: For every $0 \le \lambda < 1$ the recursion $H(p_1, \dots, p_{n-1}, \lambda p_n, (1-\lambda)p_n) = H(p_1, \dots, p_n) + p_n H(\lambda, 1-\lambda)$ holds.
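The recursion axiom (c) can be confirmed numerically for $H(p) = -\sum_i p_i\log p_i$. This is a minimal sketch with illustrative inputs; the function names are our own.

```python
import math

def H(*probs):
    """Shannon entropy -sum_i p_i log p_i (natural log), with 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def recursion_gap(probs, lam):
    """|H(p_1,...,p_{n-1}, lam*p_n, (1-lam)*p_n) - H(p_1,...,p_n) - p_n*H(lam, 1-lam)|."""
    *head, last = probs
    lhs = H(*head, lam * last, (1.0 - lam) * last)
    rhs = H(*probs) + last * H(lam, 1.0 - lam)
    return abs(lhs - rhs)

print(recursion_gap([0.2, 0.3, 0.5], 0.25))
print(recursion_gap([0.2, 0.3, 0.5], 0.0))
```

Both gaps vanish up to rounding: splitting the last outcome into two sub-outcomes adds exactly $p_n H(\lambda, 1-\lambda)$ to the uncertainty.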

These axioms determine a function $H$ up to a positive constant factor. Excepting the above story about a conversation between Shannon and von Neumann, we do not know about any mutual influence. Shannon was interested in communication theory and von Neumann's thermodynamical entropy was in the formalism of quantum mechanics. Von Neumann himself never made any connection between his quantum mechanical entropy and information. Although von Neumann's entropy formula appeared in 1927, there was not much activity concerning it for several decades. At the end of the 1960's, the situation changed. Rigorous statistical mechanics came into being [?] and soon after that the needs of rigorous quantum statistical mechanics forced new developments concerning von Neumann's entropy formula.

Von Neumann was aware of the fact that statistical operators form a convex set whose extreme points are exactly the pure states. He also knew that entropy is a concave functional, so

$$S\Big(\sum_i \lambda_i D_i\Big) \ge \sum_i \lambda_i S(D_i)$$

for any convex combination. To determine the entropy of a statistical operator, he used the Schatten decomposition, which is an orthogonal extremal decomposition in our present language. For a statistical operator $D$ there are many ways to write it in the form

$$D = \sum_i \lambda_i|\psi_i\rangle\langle\psi_i| \tag{14}$$

if we do not require the state vectors to be orthogonal. The geometry of the statistical operators, that is, the state space, allows many extremal decompositions and among them there is a unique orthogonal one if the spectrum of $D$ is not degenerate. Non-orthogonal pure states are essentially nonclassical. They are between identical and completely different. Jaynes recognized in 1956 that from the point of view of information the Schatten decomposition is optimal. He proved that

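$$S(D) = \inf\Big\{ -\sum_i \lambda_i\log\lambda_i : D = \sum_i \lambda_i D_i \Big\},$$

where the infimum is taken over all convex combinations of pure states $D_i$; it is attained by the Schatten decomposition.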

This is Jaynes' contribution to the von Neumann entropy. (However, he became known for the very strong advocacy of the maximum entropy principle.)
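The optimality of the Schatten decomposition can be seen on a small example: for a decomposition (14) into non-orthogonal pure states, the Shannon entropy of the weights is strictly larger than the von Neumann entropy, which equals the Shannon entropy of the eigenvalues. The sketch assumes $\kappa = 1$, real $2\times 2$ matrices and a hypothetical pair of state vectors.

```python
import math

def eta(t):
    """eta(t) = -t log t, with eta(0) = 0."""
    return -t * math.log(t) if t > 1e-15 else 0.0

def mix(weights, states):
    """sum_i w_i |v_i><v_i| for real unit vectors v_i in R^2."""
    d = [[0.0, 0.0], [0.0, 0.0]]
    for w, v in zip(weights, states):
        for r in range(2):
            for c in range(2):
                d[r][c] += w * v[r] * v[c]
    return d

def entropy_2x2(d):
    """von Neumann entropy of a real symmetric 2x2 density matrix."""
    tr = d[0][0] + d[1][1]
    det = d[0][0] * d[1][1] - d[0][1] * d[1][0]
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return eta((tr + disc) / 2.0) + eta((tr - disc) / 2.0)

weights = [0.5, 0.5]
s = 1.0 / math.sqrt(2.0)
states = [(1.0, 0.0), (s, s)]  # two non-orthogonal unit vectors (illustrative)
D = mix(weights, states)

weight_entropy = sum(eta(w) for w in weights)  # entropy of this decomposition
von_neumann = entropy_2x2(D)                   # entropy of the Schatten decomposition
print(weight_entropy, von_neumann)
```

The gap between the two numbers reflects that non-orthogonal states are not perfectly distinguishable.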

Certainly the highlight of quantum entropy theory in the 70's was the discovery of subadditivity. Before we state it in precise mathematical form, we describe the setting where this property is crucial. A one-dimensional quantum lattice system is a composite system of $2N+1$ subsystems, indexed by the integers $-N \le n \le N$. Each of the subsystems is described by a Hilbert space $\mathcal{H}_n$; those Hilbert spaces are isomorphic if we assume that the subsystems are physically identical, and even the very finite dimensional case $\dim\mathcal{H}_n = 2$ can be interesting if the subsystem is a "spin 1/2" attached to the lattice site $n$. The finite chain of $2N+1$ spins is described in the tensor product Hilbert space $\otimes_{n=-N}^{N}\mathcal{H}_n$, whose
dimension is $(\dim\mathcal{H}_n)^{2N+1}$. For a given Hamiltonian $H_N$ and inverse temperature $\beta$ the equilibrium state minimizes the free energy functional

$$F_N(D_N) = \mathrm{Tr}_N H_N D_N - \frac{1}{\beta}S(D_N), \tag{15}$$

and the actual minimizer is the Gibbs state

$$\frac{\exp(-\beta H_N)}{\mathrm{Tr}\,\exp(-\beta H_N)}. \tag{16}$$

It seems that this was already known in von Neumann's time but not the thermodynamical limit, $N \to \infty$. Rigorous statistical mechanics of spin chains was created in the 70's. Since entropy, energy, and free energy are extensive quantities, the infinite system should be handled by their normalized versions, called entropy density, energy density, etc. One possibility to describe the equilibrium of the infinite system is to carry out a limiting procedure from the finite volume equilibrium states, and another is to solve the variational principle for the free energy density on the state space of the infinite system. In a translation invariant theory the two approaches lead to the same conclusion, but many technicalities are involved. The infinite system is modeled by a C*-algebra and its states are normalized linear functionals instead of statistical operators. The rigorous statistical mechanics of quantum spin systems was one of the successes of the operator algebraic approach. [11] and Sec. 15 of [?] are suggested further readings about details of quantum spin systems. One of the key points in this approach is the definition of entropy density of a state of the infinite system, which goes back to the subadditivity of the von Neumann entropy. Let $\mathcal{H}_1$ and $\mathcal{H}_2$ be possibly finite dimensional Hilbert spaces corresponding to two quantum systems. A mixed state of the composite system is determined by a statistical operator $D_{12}$ acting on the tensor product $\mathcal{H}_1\otimes\mathcal{H}_2$. Assume that we are to measure observables on the first subsystem. What is the statistical operator we need? The statistical operator $D_1$ has to fulfill the condition

$$\mathrm{Tr}_1 A D_1 = \mathrm{Tr}_{12}(A\otimes I)D_{12} \tag{17}$$

for any observable $A$. Indeed, the left hand side is the expectation of $A$ in the subsystem and the right hand side is that in the total system. It is easy to see that the condition

$$\langle\psi|D_1|\psi\rangle = \sum_i \langle\psi\otimes\varphi_i|\,D_{12}\,|\psi\otimes\varphi_i\rangle \tag{18}$$

gives the statistical operator $D_1$, where $|\psi\rangle\in\mathcal{H}_1$ and $(|\varphi_i\rangle)_i$ is an arbitrary orthonormal basis in $\mathcal{H}_2$. (In fact equation (18) is obtained from (17) by putting $|\psi\rangle\langle\psi|$ in place of $A$.) It is not difficult to state the subadditivity property now:

$$S(D_{12}) \le S(D_1) + S(D_2). \tag{19}$$
This is a particular case of the strong subadditivity

$$S(D_{123}) \le S(D_{12}) + S(D_{23}) - S(D_2) \tag{20}$$

for a system consisting of three subsystems. (We hope that the notation is self-explanatory, otherwise see [?], [13] or p. 23 in [?].) If the second subsystem is lacking, (20) reduces to (19). The strong subadditivity (20) was proven first by Lieb and Ruskai in 1973 [?].
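The reduced statistical operator of (18) and the subadditivity (19) can be illustrated on two spins. The sketch below is only an example under stated assumptions: the test state $D_{12} = p\,|\Phi^+\rangle\langle\Phi^+| + (1-p)\,|01\rangle\langle 01|$ is hypothetical, chosen because its eigenvalues ($p$, $1-p$, 0, 0) are known by construction, and $\kappa = 1$.

```python
import math

def eta(t):
    """eta(t) = -t log t, with eta(0) = 0."""
    return -t * math.log(t) if t > 1e-15 else 0.0

def entropy_2x2(d):
    """von Neumann entropy of a real symmetric 2x2 density matrix."""
    tr = d[0][0] + d[1][1]
    det = d[0][0] * d[1][1] - d[0][1] * d[1][0]
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return eta((tr + disc) / 2.0) + eta((tr - disc) / 2.0)

def partial_trace(d12, keep):
    """Reduced 2x2 matrix of a two-spin 4x4 matrix; the basis index is 2*a + b.
    keep=1 sums over the second factor, keep=2 sums over the first."""
    red = [[0.0, 0.0], [0.0, 0.0]]
    for x in range(2):
        for y in range(2):
            for s in range(2):
                if keep == 1:
                    red[x][y] += d12[2 * x + s][2 * y + s]
                else:
                    red[x][y] += d12[2 * s + x][2 * s + y]
    return red

def example_state(p):
    """D12 = p |Phi+><Phi+| + (1-p) |01><01|, with Phi+ = (|00> + |11>)/sqrt(2)."""
    d = [[0.0] * 4 for _ in range(4)]
    for r in (0, 3):
        for c in (0, 3):
            d[r][c] += p / 2.0
    d[1][1] += 1.0 - p
    return d

p = 0.5
D12 = example_state(p)
D1, D2 = partial_trace(D12, 1), partial_trace(D12, 2)
S12 = eta(p) + eta(1.0 - p)  # eigenvalues of D12 are p, 1-p, 0, 0 by construction
S1, S2 = entropy_2x2(D1), entropy_2x2(D2)
print(S12, S1 + S2)
```

For $p = 1$ the composite state is pure with $S(D_{12}) = 0$ while both reduced states are maximally mixed, which shows how far from equality (19) can be.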

The measurement conditional expectation was introduced by von Neumann as the basic irreversible state change, and it is of the form

$$D \mapsto \sum_i P_i D P_i, \tag{21}$$

where $P_i$ are pairwise orthogonal projections and $\sum_i P_i = I$. (We are in the Schrödinger picture.)
The measurement conditional expectation has the following generalization. Assume that our quantum system is described by an operator algebra $\mathcal{M}$ whose positive linear functionals correspond to the states. A functional $\tau : \mathcal{M} \to \mathbb{C}$ is a state if $\tau(A) \ge 0$ for any positive observable $A$ and $\tau(I) = 1$. An operational partition of unity is a finite subset $W = (V_1, V_2, \dots, V_n)$ of $\mathcal{M}$ such that $\sum_i V_i^* V_i = I$. In the Heisenberg picture $W$ acts on the observables as

$$A \mapsto \sum_i V_i^* A V_i$$

and the corresponding state change in the Schrödinger picture is

$$\tau(\,\cdot\,) \mapsto \tau\Big(\sum_i V_i^*\,\cdot\,V_i\Big).$$

Let us compare this with the traditional formalism of quantum mechanics. If $\tau(A) = \mathrm{Tr}\,DA$, then

$$\tau\Big(\sum_i V_i^* A V_i\Big) = \mathrm{Tr}\Big(D\sum_i V_i^* A V_i\Big) = \mathrm{Tr}\Big(\sum_i V_i D V_i^*\Big)A,$$

hence the transformation of the statistical operator is

$$D \mapsto \sum_i V_i D V_i^*,$$
which is an extension of von Neumann's measurement conditional expectation (21). Given a state $\tau$ of the quantum system, the observed entropy of the operational process is defined to be the von Neumann entropy of the finite statistical operator

$$\big[\tau(V_i^* V_j)\big]_{i,j=1}^{n},$$

which is an $n \times n$ positive semidefinite matrix of trace 1. If we are interested in the entropy of a state, we perform all operational processes and compute their entropy. If the operational process changes the state of our system, then the observed operational entropy includes the entropy of the state change. Hence we have to restrict ourselves to state invariant operational processes when focusing on the entropy of the state. The formal definition

$$S^L(\tau) = \sup\Big\{ S\Big(\big[\tau(V_i^* V_j)\big]_{i,j}\Big) \Big\}$$

is the operational (or Lindblad) entropy of the state $\tau$ if the least upper bound is taken over all operational partitions of unity $W = (V_1, V_2, \dots, V_n)$ such that

$$\tau(A) = \tau\Big(\sum_i V_i^* A V_i\Big)$$
for every observable $A$. For a statistical operator $D$ we have

$$S^L(D) = 2S(D),$$

and we may imagine that the factor 2 is removable by appropriate normalization, so that we are back to the von Neumann entropy. The operational entropy satisfies von Neumann's mixing condition and is a concave functional on the states even in the presence of superselection rules. However, it has some new features. To see a concrete example, assume that there are two superselection sectors and the operator algebra is $M_2(\mathbb{C}) \oplus M_2(\mathbb{C})$, that is, the direct sum of two full matrix algebras. Let a state $\tau_0$ be the mixture of the orthogonal pure states $|\psi_i\rangle\langle\psi_i|$ with weights $\lambda_i$, where $|\psi_1\rangle$, $|\psi_2\rangle$ are in the first sector and $|\psi_3\rangle$ is in the second. This assumption implies that there is no dynamical change sending $|\psi_1\rangle$ into $|\psi_3\rangle$, and superpositions of those states are also prohibited. One computes

$$S^L(\tau_0) = -2\sum_i \lambda_i\log\lambda_i - (\lambda_1+\lambda_2)\log(\lambda_1+\lambda_2) - (\lambda_1+\lambda_2+\lambda_3)\log(\lambda_1+\lambda_2+\lambda_3),$$

which shows that this entropy is really sensitive to the superselection sectors. (For further properties of $S^L$ we refer to pp. 121-124 of [?].)

Nowadays some devices are based on quantum mechanical phenomena, and this holds also for information transmission. For example, in optical communication a polarized photon can carry information. Although von Neumann apparently did not see an intimate connection between his entropy formula and the formally rather similar Shannon information measure, many years later an information theoretical reinterpretation of von Neumann's entropy is becoming common. Communication theory deals with coding, transmission, and decoding of messages. Given a set $\{a_1, a_2, \dots, a_n\}$ of messages, a coding procedure assigns to each $a_i$ a physical state, say a quantum mechanical state $|\psi_i\rangle$. The states are transmitted and received. During the transmission some noise can enter. The receiver uses some observables to recover the transmitted message. Shannon's classical model is stochastic, so it is assumed that each message $a_i$ should be transmitted with some probability $\lambda_i$, $\sum_i \lambda_i = 1$. Hence in the quantum model the input state of the channeling transformation is a mixture; its statistical operator is $D_{\mathrm{in}} = \sum_i \lambda_i|\psi_i\rangle\langle\psi_i|$. This is the state we need to transmit, and after transmission it changes into $T(D_{\mathrm{in}}) = D_{\mathrm{out}}$, which is formally a statistical operator but may correspond to a state of a very different system. Input and output could be far away in space as well as in time. The observer measures the observables $A_j$, and $p_j = \mathrm{Tr}\,D_{\mathrm{out}}A_j$ is the probability with which he concludes that the message $a_j$ was transmitted. Here we need $\sum_j A_j = I$ and $0 \le A_j$. More generally, we assume that

$$p_{ji} = \mathrm{Tr}\,T(|\psi_i\rangle\langle\psi_i|)A_j$$

is the probability that the receiver deduces the transmission of the message $a_j$ when actually the message $a_i$ was transmitted. If we forget about the quantum mechanical coding, transmission and decoding (measurement), we see a classical information channel in Shannon's sense. According to Shannon, the amount of information going through the channel is

$$I = \sum_{i,j} \lambda_i\, p_{ji}\log\frac{p_{ji}}{p_j}, \qquad p_j = \sum_i \lambda_i p_{ji}.$$

One of the basic problems of communication theory is to maximize this quantity subject to certain constraints. For the sake of simplicity, assume that there is no noise. This may happen when the channel is actually the memory of a computer; storage of information might be a noiseless channel in Shannon's sense. We have then $T = \mathrm{identity}$, $D_{\mathrm{in}} = D_{\mathrm{out}} = D$ and the inequality

$$I \le S(D)$$

holds. If we fix the channel state $D$ and optimize with respect to the probabilities $\lambda_i$, the states $|\psi_i\rangle$ and the observables $A_j$, then the maximum information transmittable through the channel is exactly the von Neumann entropy. What we are considering is a simple example, probably the simplest possible. However, it is well demonstrated that the von Neumann entropy is actually the capacity of a communication channel. Recently, there has been a lot of discussion about capacities of quantum communication channels, which is outside of the scope of the present article. However, the fact that von Neumann's entropy formula has much to do with Shannon theory and possesses an interpretation as measure of information must be conceptually clear without entering more sophisticated models and
discussions. More details are in [?] and a mathematically full account is [?].
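The noiseless case can be illustrated with Shannon's mutual information $I = \sum_{i,j}\lambda_i p_{ji}\log(p_{ji}/p_j)$, where $p_j = \sum_i \lambda_i p_{ji}$. The sketch below assumes orthonormal code states $|\psi_i\rangle$, so that $S(D) = -\sum_i\lambda_i\log\lambda_i$ with $\kappa = 1$; the transition tables `perfect` and `blurred` are hypothetical examples of an ideal and a non-ideal decoding measurement.

```python
import math

def shannon(probs):
    """-sum_i p_i log p_i (natural log), with 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def mutual_information(lams, p):
    """I = sum_{i,j} lam_i p[j][i] log(p[j][i] / p_j), with p_j = sum_i lam_i p[j][i].
    p[j][i] is the probability of decoding a_j when a_i was sent."""
    total = 0.0
    for row in p:
        pj = sum(l * pji for l, pji in zip(lams, row))
        for l, pji in zip(lams, row):
            if l > 0 and pji > 0:
                total += l * pji * math.log(pji / pj)
    return total

lams = [0.5, 0.3, 0.2]  # illustrative message probabilities
perfect = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
blurred = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]

S_D = shannon(lams)  # entropy of D = sum_i lam_i |psi_i><psi_i| for orthonormal psi_i
print(mutual_information(lams, perfect), S_D)
print(mutual_information(lams, blurred), S_D)
```

With the ideal measurement the information equals $S(D)$, while the blurred one transmits strictly less, in agreement with $I \le S(D)$.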

Further sources about quantum entropy and quantum information are [?], [?] and [?].